James N. Eagle    Alan R. Washburn

Department  of  Operations  Research 

Naval  Postgraduate  School 

Monterey,  CA  93943 

CSEGs  are  search-evasion  games  where  play  proceeds  throughout  some  specified  period  without 
any  interim  feedback  to  either  of  the  two  players,  each  of  whom  is  assumed  to  move  according  to 
some preselected plan. If $(X_t, Y_t)$ are the positions of the two players at time $t$, then the payoff is
$N = \sum_{t=1}^{T} A(X_t, Y_t, t)$. That is, the payoff is a cumulative score over the time intervals $1, \ldots, T$.
One possibility is that $A(X_t, Y_t, t)$ is an indicator of the event $X_t = Y_t$, in which case $N$ is the
total  number  of  coincidences.  This  is  the  definition  that  motivated  the  class,  but  it  is  not  the  only 
possibility. 

Both  players  are  assumed  to  move  among  some  finite  set  of  cells  C.  Initial  positions  in  C 
are  determined  by  probability  distributions  which  are  known  to  both  players.  One  possibility  is 
that  the  distributions  correspond  to  specific  starting  cells.  This  will  be  the  case  in  a  subsequent 
example,  but,  again,  it  is  not  required. 

Given his initial position $X_1$ and the assumed distribution for player 2's initial cell, player 1
must select a feasible track $X_1, X_2, \ldots, X_T$. A track is feasible if $X_{t+1}$ lies in the given set $S(X_t, t)$
for $1 \le t \le T-1$. These tracks are player 1's pure strategies. Likewise $Y_{t+1}$ must lie in the given
set $E(Y_t, t)$ for $1 \le t \le T-1$. Generally $S(i,t)$ and $E(i,t)$ include cell $i$ and some of its neighbors,
the idea being that feasible tracks should connect neighboring cells. The payoff is determined once
the tracks are selected by both players. Player 1 attempts to maximize the expected payoff, $E[N]$,
and player 2 to minimize it. Given the interpretation of the problem, it is natural to expect optimal
strategies for both sides to be mixed.
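
The definitions of a feasible track and of the payoff $N$ can be illustrated with a minimal Python sketch. The helper names (`is_feasible`, `payoff`, `line_moves`, `coincidence`) and the four-cell line are assumptions made for illustration only; `moves(cell, t)` stands in for either $S(i,t)$ or $E(i,t)$.

```python
def is_feasible(track, moves):
    """True if each cell of `track` is reachable from the cell before it.

    `moves(cell, t)` is a hypothetical stand-in for S(i, t) or E(i, t);
    times are 1-based, so track[0] is the position at time 1.
    """
    return all(track[t] in moves(track[t - 1], t) for t in range(1, len(track)))


def payoff(x_track, y_track, A):
    """N = sum over t of A(X_t, Y_t, t) for two fixed tracks."""
    return sum(A(x, y, t) for t, (x, y) in enumerate(zip(x_track, y_track), start=1))


def line_moves(num_cells):
    """Stay put or step to an adjacent cell on a line of cells 1..num_cells."""
    def moves(cell, t):
        return {c for c in (cell - 1, cell, cell + 1) if 1 <= c <= num_cells}
    return moves


def coincidence(x, y, t):
    """Indicator of the event X_t = Y_t."""
    return 1 if x == y else 0


moves = line_moves(4)
searcher = [1, 2, 3, 3]   # illustrative tracks, not taken from the paper
evader = [4, 4, 3, 3]
print(is_feasible(searcher, moves), is_feasible(evader, moves))
print(payoff(searcher, evader, coincidence))
```

With the coincidence indicator, the payoff simply counts the periods in which the two tracks share a cell.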

1.  Discussion  and  Motivation 

One might prefer to consider a similar class of games where the pure strategy payoff is $1 - e^{-N}$, since
that quantity can be interpreted as a detection probability if $A(x, y, t)$ is "detection rate at time $t$"
(Koopman (1980)). Alternatively one might let the payoff be "time to the first detection," as in
Ruckle's (1983) Pursuit on a Graph game. Such detection games are of considerable operational
interest. Single player versions where player 2's motion is according to a specified Markov process
have been considered by Stewart (1979, 1980), Eagle (1984), and Trummel and Weisinger (1981),
and there is a more extensive literature (Stone (1989)) if the searcher's path is not constrained. It
would indeed be satisfying to find an efficient method for solving the corresponding detection games
where the evader's path is not probabilistically specified, and where he can thus more completely
live up to his title. Unfortunately, the methods to be introduced later are tailored to the payoff $N$,
rather than $1 - e^{-N}$. Of course $1 - e^{-N}$ is approximately equal to $N$ when $N$ is small, so CSEGs can
be regarded as first order approximations to detection games. The scale of $A(\cdot,\cdot,\cdot)$ is immaterial
in solving a CSEG, but the validity of the approximation to a detection game will be best when
$A(\cdot,\cdot,\cdot)$ is small.
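
The quality of the first order approximation can be checked numerically. The following Python sketch tabulates $1 - e^{-N}$ against $N$ for a few illustrative values; the function name `detection_probability` and the sample values are assumptions, not taken from the paper.

```python
import math


def detection_probability(n):
    """Detection probability 1 - exp(-N) for a cumulative score N."""
    return 1.0 - math.exp(-n)


for n in (0.01, 0.1, 0.5, 1.0):
    p = detection_probability(n)
    # For small N the relative gap (N - p) / N behaves roughly like N / 2.
    print(f"N={n:<5} 1-exp(-N)={p:.4f} relative gap={(n - p) / n:.3f}")
```

The gap is negligible for scores near 0.01 and becomes substantial as $N$ approaches 1, which is why the approximation is best when $A(\cdot,\cdot,\cdot)$ is small.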

Direct  motivation  of  CSEGs  is  also  possible.  There  are  a  variety  of  reasons  why  the  results  of 
search  might  not  be  known  until  it  is  over.  Photographic  film  might  have  to  be  developed  or  nets 
hauled  in.  Another  possible  application  is  search  planning  for  autonomous  vehicles;  for  example, 
an over-the-horizon unmanned aircraft whose track must be specified before launch. Also, there is
no  real  reason  in  CSEGs  for  restricting  the  two  sides  to  consist  of  a  single  agent  each.  The  two 
sides  might  be  teams  or  even  armies,  one  seeking  contact  and  the  other  desirous  of  avoiding  it. 
The  "no  feedback"  restriction  might  then  be  viewed  as  a  natural  consequence  of  the  difficulties  of 
communication  in  the  field. 

Although the payoff in a CSEG has the same form as in a Multi-Stage Game (Thomas (1984)),
CSEGs  are  not  MSGs.  To  make  an  MSG  out  of  a  CSEG  one  would  have  to  reveal  the  position 
of  each  player  to  the  other  after  each  move,  so  that  the  joint  position  could  serve  as  a  "state." 
Although  such  games  are  interesting,  they  are  not  what  we  have  in  mind  here. 

2.   Initial  Observations 

CSEGs  are  finite,  two-person  zero-sum  games,  so  solutions  certainly  exist.  The  straightforward  way 
to  proceed  would  be  to  list  all  feasible  tracks  and  then  use  linear  programming  to  find  the  optimal 
probabilities  for  each  track.  The  difficulty  with  this  is  that  the  number  of  feasible  tracks  explodes 
rapidly with the size of the problem. If the sets $S(i,t)$ all have three elements, and if the initial
distribution for player 1's position is also concentrated on three points, there are $3^T$ pure strategies
for player 1. This kind of exponential growth makes the "brute force" approach impractical for
even  moderately  sized  problems.  The  object  must  be  to  take  advantage  of  the  special  structure  of 
CSEGs  to  develop  more  efficient  methods. 

A  mixed  strategy  for  either  player  is  a  discrete  probability  distribution  over  the  possible  feasible 
tracks.  Given  mixed  strategies  for  players  1  and  2,  let  p(i,t)  be  the  marginal  probability  that 
player 1 visits cell $i$ at time $t$. Likewise let $q(i,t)$ be the corresponding probability for player 2.
Then  since  the  expectation  of  a  sum  is  the  sum  of  expectations,  and  since  the  two  players  choose 
their  strategies  independently, 

$$E[N] = \sum_{t=1}^{T} \sum_{i \in C} \sum_{j \in C} A(i,j,t)\, p(i,t)\, q(j,t).$$

This payoff depends only on the marginal distributions $p(\cdot,\cdot)$ and $q(\cdot,\cdot)$, so there is the possibility of
an analysis based directly on them, rather than on the mixed strategies themselves. Furthermore,
when $p(\cdot,\cdot)$ is given, player 2's problem in selecting an optimal track is a $T$-period shortest path
problem, a relatively simple type that can be solved quickly even for large problems. To see this, let
$D(j,t) = \sum_{i \in C} A(i,j,t)\, p(i,t)$ be the penalty associated with visiting cell $j$ at time $t$. Then player 2
wants to find a feasible track that visits the cells in such a manner as to minimize the sum of all
$T$ such penalties, a shortest path problem that can easily be solved using dynamic programming.
Given a mixed strategy for player 1, this shortest path solution gives a lower bound on the value
of the game. Similar comments hold concerning player 1's selection of a track when $q(\cdot,\cdot)$ is given.
The fact that a lower bound on the value of the game is determined by specifying $p(\cdot,\cdot)$ and solving
a shortest path problem, and that an upper bound is found by specifying $q(\cdot,\cdot)$ and solving a longest
path problem will prove invaluable in the techniques to be discussed in the following sections.
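
The shortest path computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `evader_best_response`, the dictionary layout of `p`, and the two-cell example are hypothetical and not taken from the paper.

```python
def evader_best_response(p, A, E, cells, start_dist):
    """Shortest-path dynamic program for player 2 against fixed marginals p.

    p[t][i]    : probability that player 1 is in cell i at time t (t = 1..T)
    A(i, j, t) : one-period payoff when player 1 is in i and player 2 in j
    E(j, t)    : cells player 2 may occupy at time t + 1 from cell j at time t
    start_dist : dict mapping cell -> probability of player 2 starting there
    Returns (lower bound on the game value, cost-to-go table g).
    """
    T = max(p)
    # D[t][j]: expected penalty for player 2 being in cell j at time t.
    D = {t: {j: sum(A(i, j, t) * p[t].get(i, 0.0) for i in cells) for j in cells}
         for t in range(1, T + 1)}
    g = {T: dict(D[T])}
    for t in range(T - 1, 0, -1):  # backward recursion over time
        g[t] = {j: D[t][j] + min(g[t + 1][k] for k in E(j, t)) for j in cells}
    bound = sum(prob * g[1][j] for j, prob in start_dist.items())
    return bound, g


# Illustrative two-cell, two-period game with coincidence payoff.
cells = [1, 2]
p = {1: {1: 1.0}, 2: {1: 0.5, 2: 0.5}}       # searcher marginals
A = lambda i, j, t: 1.0 if i == j else 0.0
E = lambda j, t: cells                        # on two cells every move is allowed
bound, g = evader_best_response(p, A, E, cells, {2: 1.0})
print(bound)
```

Swapping `min` for `max` and exchanging the roles of the players gives the corresponding longest path problem and hence an upper bound.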

CSEGs often have a "turnpike" property (Whittle (1983)) in the sense that optimal marginal
distributions are attracted to a certain equilibrium pair $(p^*(\cdot,\cdot), q^*(\cdot,\cdot))$. More precisely, let $v(t)$ be
the value of the one-period matrix game $A(\cdot,\cdot,t)$, and let $p^*(\cdot,t)$ and $q^*(\cdot,t)$ be optimal mixed
strategies for the two players, unrestricted except that each must be a discrete probability distribution
over the cells in $C$. If $p^*(\cdot,t)$ and $q^*(\cdot,t)$ are feasible marginal distributions for each time
period of a $T$-period CSEG, then they must also be optimal. Furthermore, the value of the CSEG
is $\sum_{t=1}^{T} v(t)$. In general the feasibility requirement will fail because $p(\cdot,t)$ and $q(\cdot,t)$ are required
by the path constraints to resemble the initial distributions for small values of $t$. However, we can
say

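Theorem 1. Suppose $p(\cdot,\cdot)$ and $q(\cdot,\cdot)$ are optimal for the $T_1$-period CSEG, suppose $T_2 > T_1$, and
let

$$(\hat{p}(\cdot,t), \hat{q}(\cdot,t)) = \begin{cases} (p(\cdot,t), q(\cdot,t)) & \text{for } 1 \le t \le T_1, \\ (p^*(\cdot,t), q^*(\cdot,t)) & \text{for } T_1 < t \le T_2. \end{cases}$$

If $\hat{p}(\cdot,\cdot)$ and $\hat{q}(\cdot,\cdot)$ are feasible for the $T_2$-period CSEG, then they are also optimal.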

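Proof: Let $E[N(T)]$ be the expected payoff and $V(T)$ be the value of the $T$-period CSEG. Since
$p(\cdot,\cdot)$ is optimal for the $T_1$-period game, $E[N(T_1)] \ge V(T_1)$ when player 1 uses $p(\cdot,\cdot)$ and player 2
uses any feasible mixed strategy. Since $\hat{p}(\cdot,\cdot)$ agrees with $p(\cdot,\cdot)$ for $t \le T_1$, the same can be said of
$\hat{p}(\cdot,\cdot)$. Moreover, $p^*(\cdot,t)$ guarantees at least $v(t)$ in each period $t > T_1$. Therefore if player 1 uses $\hat{p}(\cdot,\cdot)$,

$$E[N(T_2)] \ge V(T_1) + \sum_{t=T_1+1}^{T_2} v(t).$$

Likewise if player 2 uses $\hat{q}(\cdot,\cdot)$,

$$E[N(T_2)] \le V(T_1) + \sum_{t=T_1+1}^{T_2} v(t).$$

The theorem follows. Furthermore, the value of the $T_2$-period CSEG is

$$V(T_2) = V(T_1) + \sum_{t=T_1+1}^{T_2} v(t).$$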


If (i) $A(\cdot,\cdot,t)$ does not actually depend on $t$, then neither will $p^*(\cdot,t)$ nor $q^*(\cdot,t)$. Additionally,
if (ii) the path constraints allow both players to remain stationary, then these two "equilibrium"
distributions will be feasible at $t+1$ if they are feasible at $t$. Finally, if (i) and (ii) hold and
$p^*(\cdot,\cdot)$ and $q^*(\cdot,\cdot)$ are feasible at time $t$, then $p^*(\cdot,\cdot)$ and $q^*(\cdot,\cdot)$ are feasible and optimal marginal
distributions  from  t  onward.  Solving  the  CSEG  can  then  be  viewed  as  programming  the  two  sides  to 
move  from  the  given  initial  position  distributions  to  equilibrium  distributions.  Only  the  transient 
phase  presents  any  computational  difficulty;  once  the  equilibrium  distributions  are  encountered, 
they  are  feasible  and  optimal  from  that  point  on.  We  now  turn  to  methods  for  solving  specific 
CSEGs. 

3.  The  Brown-Robinson  Method 

In  Robinson  (1951),  the  method  of  fictitious  play  was  shown  to  iteratively  solve  two-person  zero-sum 
matrix  games.  This  procedure  had  been  suggested  earlier  by  G.  W.  Brown.  To  describe  fictitious 
play,  let  player  1  be  the  row  (maximizing)  player  and  player  2  be  the  column  (minimizing)  player. 
Rows  and  columns  correspond  to  the  pure  strategies  (tracks)  described  earlier.  If  player  1  selects 
row $i$ and player 2 selects column $j$, then reward $a_{ij}$ is paid from player 2 to player 1. In each
fictitious play of the game (except the first), the players select the best pure strategy response to
the empirical mix of the opponent's pure strategies observed so far. So at play $k \ge 2$, player 1
chooses the pure strategy $x_k$ (a vector where every component but one is 0) that is a best response
to

$$\bar{y}_{k-1} = \frac{1}{k-1} \sum_{t=1}^{k-1} y_t,$$

where $y_t$ is the pure strategy played by player 2 at play $t$. Then player 2 chooses the pure strategy
$y_k$, which is the best response to the updated row average

$$\bar{x}_k = \frac{1}{k} \sum_{t=1}^{k} x_t = \bar{x}_{k-1} + \frac{1}{k}\,(x_k - \bar{x}_{k-1}).$$

Any limit points of the sequences $\{\bar{x}_k\}$ and $\{\bar{y}_k\}$ are solutions to the game. Also upper and
lower bounds on the value of the game, $v$, are determined at each game play. Specifically, at game
play $k$,

$$\underline{v}_k = \bar{x}_k^{\mathsf{T}} A\, y_k \le v \le x_k^{\mathsf{T}} A\, \bar{y}_{k-1} = \bar{v}_k,$$

and both $\underline{v}_k$ and $\bar{v}_k$ converge to $v$. Fictitious play begins with the players selecting arbitrary
strategies (pure or mixed) $\bar{x}_1 = x_1$ and $\bar{y}_1 = y_1$.

We  note  that  to  solve  a  matrix  game  by  fictitious  play,  each  player  need  only  be  able  to  select 
a best pure strategy response to any mixed strategy and evaluate the expected return. For CSEGs,
this means that for fictitious play number $k \ge 2$, player 1 must be able to first update the running
average of the previously observed $k-1$ pure strategies played by player 2, and then solve the
$T$-period longest path problem giving the best pure strategy response for player 1. Similarly, player 2
must be able to update the running average of the previously observed $k$ pure strategies played
by player 1, and then solve the shortest path problem giving the best pure strategy response for
player 2. The procedure begins with both players selecting arbitrary $T$-period strategies.

The Brown-Robinson method is notorious for converging very slowly to the optimal solution.
However, the simplicity of the updating procedure, which allows solution of moderate sized problems
on  microcomputers,  makes  it  appealing  for  CSEGs. 
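
The iteration can be sketched for an explicit payoff matrix as follows. This is a minimal Python illustration, not the implementation used for the results reported here; the function name `fictitious_play`, the choice of the first row and column as the arbitrary opening plays, and the 2 x 2 example are assumptions. For a CSEG the two `max`/`min` selections would be replaced by longest and shortest path computations on the marginals.

```python
def fictitious_play(a, plays):
    """Brown-Robinson iteration for the matrix game a (row player maximizes).

    Returns (x_bar, y_bar, lower, upper): the empirical mixed strategies and
    the best lower and upper bounds on the value seen during the run.
    """
    m, n = len(a), len(a[0])
    x_count = [0] * m
    y_count = [0] * n
    x_count[0] += 1                          # arbitrary first plays
    y_count[0] += 1
    row_pay = [a[i][0] for i in range(m)]    # sum over past y_t of a[i][y_t]
    col_pay = [a[0][j] for j in range(n)]    # sum over past x_t of a[x_t][j]
    lower, upper = float("-inf"), float("inf")
    for k in range(2, plays + 1):
        # Player 1: best response to the average of the first k - 1 columns.
        i = max(range(m), key=lambda r: row_pay[r])
        upper = min(upper, row_pay[i] / (k - 1))
        x_count[i] += 1
        for j in range(n):
            col_pay[j] += a[i][j]
        # Player 2: best response to the updated average of k rows.
        j = min(range(n), key=lambda c: col_pay[c])
        lower = max(lower, col_pay[j] / k)
        y_count[j] += 1
        for r in range(m):
            row_pay[r] += a[r][j]
    x_bar = [c / plays for c in x_count]
    y_bar = [c / plays for c in y_count]
    return x_bar, y_bar, lower, upper


# Illustrative game with value 0.5.
x_bar, y_bar, lower, upper = fictitious_play([[0, 1], [1, 0]], 2000)
print(lower, upper)
```

Only running sums are stored, which reflects the simplicity of the updating step: each play costs one best-response search and one vector addition per player.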

4.  The  Linear  Programming  (LP)  Method 

It has been mentioned that CSEGs could conceivably be solved with LP methods if all pure strategies
are enumerated. In this section an LP formulation is presented which does not require this
explicit  enumeration  yet,  unlike  fictitious  play,  solves  the  game  exactly. 

To set up the LP, first let $g(j,t)$ be the smallest possible payoff accumulated over periods
$t, t+1, \ldots, T$, given that player 2 starts in cell $j$ at time $t$ and that player 1's mixed strategy has
marginals $p(\cdot,\cdot)$. Then

$$g(j,t) = \sum_{i \in C} A(i,j,t)\, p(i,t) + \min_{k \in E(j,t)} g(k,t+1), \qquad (1)$$

where the minimum term is omitted when $t = T$. Since player 2's location at time 1 is specified
by the distribution $q(\cdot)$, player 1's object is to maximize $E[N] = \sum_{i \in C} q(i)\, g(i,1)$.

The feasibility (i.e., path) constraints are incorporated by introducing $u(i,j,t)$ as the probability
that player 1 visits cell $i$ at time $t$ and cell $j$ at time $t+1$. Then the marginal variables $p(\cdot,\cdot)$
can be dispensed with because

$$p(i,t) = \sum_{j \in S(i,t)} u(i,j,t); \quad i \in C,\ t = 1, \ldots, T-1; \qquad (2)$$

or alternatively,

$$p(i,t) = \sum_{j \in S^*(i,t)} u(j,i,t-1); \quad i \in C,\ t = 2, \ldots, T. \qquad (3)$$

Here $S^*(i,t) = \{j \in C \mid i \in S(j,t-1)\}$ for $i$ in $C$ and $t = 2, \ldots, T$ is "the set of cells player 1 might
have come from." This is distinguished from $S(i,t)$, which is "the set of cells to which player 1
might go." As long as the right hand sides of (2) and (3) are equal, the common value is a feasible
marginal distribution for player 1. Using only the $u(\cdot,\cdot,\cdot)$, $g(\cdot,\cdot)$, and $p(\cdot,T)$ variables, player 1's
problem is the following LP (the indicated dual variables will later be associated with player 2's
LP):


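$$\begin{array}{llll}
\text{maximize} & \sum_{i \in C} q(i)\, g(i,1) & & \\
\text{subject to:} & & \text{dual variables} & \\
\sum_{k \in S(i,1)} u(i,k,1) = p(i); & i \in C & h(i,1) & (4) \\
-\sum_{j \in S^*(i,t)} u(j,i,t-1) + \sum_{k \in S(i,t)} u(i,k,t) = 0; & i \in C,\ t = 2, \ldots, T-1 & h(i,t) & (5) \\
-\sum_{j \in S^*(i,T)} u(j,i,T-1) + p(i,T) = 0; & i \in C & h(i,T) & (6) \\
-\sum_{k \in C} A(k,i,T)\, p(k,T) + g(i,T) \le 0; & i \in C & q(i,T) & (7) \\
-\sum_{i \in C} A(i,j,t) \sum_{l \in S(i,t)} u(i,l,t) - g(k,t+1) + g(j,t) \le 0; & j \in C,\ k \in E(j,t),\ t = 1, \ldots, T-1 & v(j,k,t) & (8) \\
u(i,j,t) \ge 0; & i,j \in C,\ t = 1, \ldots, T-1 & & \\
p(i,T) \ge 0; & i \in C & &
\end{array}$$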

Constraints (4) enforce the starting condition $p(i,1) = p(i)$; constraints (5) enforce the equality of
(2) and (3); constraints (6) and (7) are the appropriate terminal conditions for $p(i,T)$ and $g(i,T)$;
and constraints (8) are implied by (1). A proof that (8) and (1) are actually equivalent, and that the
solution  of  the  LP  is  therefore  the  solution  of  the  game,  could  be  based  on  an  inductive  argument 
that  the  objective  function  cannot  be  maximal  unless  at  least  one  (8)-type  constraint  is  tight  for 
each  (j,t).  However,  it  is  simpler  to  merely  observe  that  the  solution  of  this  LP  is  in  any  case  a 
lower  bound  on  the  value  of  the  game,  and  to  conclude  equality  from  the  fact  that  the  dual  of  this 
LP  is  the  corresponding  minimization  problem  for  player  2. 
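
The role of the flow variables $u(i,j,t)$ can be made concrete with a short Python sketch that recovers the marginals through (2) and (3) and checks the conservation conditions expressed by (4) and (5). The function name `marginals_from_flows`, the dictionary layout, and the two-cell example are assumptions made for illustration.

```python
def marginals_from_flows(u, start, T):
    """Recover p(i, t) from arc flows u[(i, j, t)] using equations (2) and (3).

    `start` maps cell -> initial probability p(i). Raises ValueError if the
    flow into a cell differs from the flow out of it, i.e. if the conditions
    enforced by constraints (4) and (5) fail.
    """
    out_flow, in_flow = {}, {}
    for (i, j, t), mass in u.items():
        out_flow[(i, t)] = out_flow.get((i, t), 0.0) + mass        # eq. (2)
        in_flow[(j, t + 1)] = in_flow.get((j, t + 1), 0.0) + mass  # eq. (3)
    cells = {i for (i, _, _) in u} | {j for (_, j, _) in u} | set(start)
    p = {}
    for t in range(1, T + 1):
        for i in cells:
            arriving = start.get(i, 0.0) if t == 1 else in_flow.get((i, t), 0.0)
            leaving = out_flow.get((i, t), 0.0) if t < T else arriving
            if abs(arriving - leaving) > 1e-9:
                raise ValueError(f"flow not conserved at cell {i}, time {t}")
            p[(i, t)] = arriving
    return p


# Illustrative flows on two cells over three periods.
u = {(1, 1, 1): 0.5, (1, 2, 1): 0.5, (1, 1, 2): 0.5, (2, 2, 2): 0.5}
p = marginals_from_flows(u, {1: 1.0}, 3)
print(p[(1, 3)], p[(2, 3)])
```

Any nonnegative flow that passes this check corresponds to at least one mixed strategy over feasible tracks, which is why the LP can work with flows instead of enumerated tracks.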

This  duality  relationship  will  also  allow  us  to  identify  the  optimal  solution  for  one  player  from 
the  optimal  dual  variables  in  his  opponent's  LP.  To  see  this,  let  v(i,j,t)  be  player  2's  counterpart 
to  u(i,j,t),  and  let  h(i,t)  be  the  maximum  obtainable  expected  total  reward  when  player  1  starts 
in cell $i$ at time $t$ and player 2 uses $v(\cdot,\cdot,\cdot)$. Then the problem player 2 must solve, which is the
dual of player 1's LP, is


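$$\begin{array}{llll}
\text{minimize} & \sum_{i \in C} p(i)\, h(i,1) & & \\
\text{subject to:} & & \text{dual variables} & \\
\sum_{k \in E(i,1)} v(i,k,1) = q(i); & i \in C & g(i,1) & (9) \\
-\sum_{j \in E^*(i,t)} v(j,i,t-1) + \sum_{k \in E(i,t)} v(i,k,t) = 0; & i \in C,\ t = 2, \ldots, T-1 & g(i,t) & (10) \\
-\sum_{j \in E^*(i,T)} v(j,i,T-1) + q(i,T) = 0; & i \in C & g(i,T) & (11) \\
-\sum_{k \in C} A(i,k,T)\, q(k,T) + h(i,T) \ge 0; & i \in C & p(i,T) & (12) \\
-\sum_{j \in C} A(i,j,t) \sum_{l \in E(j,t)} v(j,l,t) - h(k,t+1) + h(i,t) \ge 0; & i \in C,\ k \in S(i,t),\ t = 1, \ldots, T-1 & u(i,k,t) & (13) \\
v(i,j,t) \ge 0; & i,j \in C,\ t = 1, \ldots, T-1 & & \\
q(i,T) \ge 0; & i \in C & &
\end{array}$$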

Player 1's LP can be made smaller by using (6) to solve for $p(i,T)$ and then substituting into
(7). This eliminates constraints (6) and variables $p(i,T)$. Likewise constraints (11) and variables
$q(i,T)$ can be eliminated from player 2's LP. After these simplifications, the number of variables
in player 1's LP is the number of nodes plus the number of arcs in the $T$-period network specified
by constraints (4) and (5). Similarly, the number of variables in player 2's smaller problem is
the  number  of  nodes  plus  arcs  defined  by  constraints  (9)  and  (10).  Furthermore,  the  number  of 
constraints  in  one  player's  LP  is  equal  to  the  number  of  variables  in  his  opponent's  problem.  So 
for  both  players,  the  number  of  variables  and  constraints  expands  linearly  with  T  rather  than 
exponentially.  Thus  for  other  than  very  small  problems,  solving  these  LPs  is  less  burdensome  than 
the  "brute  force"  LP  procedure  mentioned  earlier. 

When compared to fictitious play, the LP procedure's primary advantage is that exact answers
are produced. One would expect to resort to fictitious play only when the LP problem size exceeds
the  capability  of  available  LP  solvers. 

5.  The  One-Dimensional  Game 

Consider a CSEG where $2n$ cells ($n \ge 1$) are arranged linearly with the searcher (player 1) initially
in cell 1 and the evader (player 2) initially in cell $2n$. Transitions to neighboring cells are possible,
or either party may remain stationary. Thus, except for end cells 1 and $2n$, $E(i,t) = S(i,t) =
E^*(i,t) = S^*(i,t) = \{i-1, i, i+1\}$ for all $t$. The payoff at time $t$ is 1 if searcher and evader are in
the same cell, otherwise 0. The equilibrium distributions $p^*(\cdot,t)$ and $q^*(\cdot,t)$ are easily demonstrated
to be uniform, so for large $T$ we expect the value of the $T$-period game to be $v_n(T) = T/2n - K_n$
for some $K_n$. Questions of interest are:

•  Is $K_n$ predictable, and what does "large $T$" mean?

•  What  is  the  nature  of  the  optimal  strategies? 

One  reasonable  strategy  for  the  evader  is  what  we  will  call  "spreading."  The  idea  is  to  achieve 
the  equilibrium  distribution  as  fast  as  possible,  and  while  doing  so  to  assure  that  every  cell  feasible 
for  the  searcher  contains  at  most  the  equilibrium  probability.  Spreading  is  not  feasible  in  every 
CSEG,  but  the  evader  has  no  trouble  employing  it  in  the  game  under  consideration.  Figure  1  shows 
a  spreading  strategy  when  there  are  four  cells. 

Figure 1. Evader "spreads" unit probability over 4 cells. Cells not feasible for
the searcher are shaded.


Since the searcher can obtain nothing on the first 2 opportunities and at most .25 per look on the
third and subsequent opportunities, $v_2(T) \le (T-2)/4$ for $T \ge 2$. Therefore $K_2 \ge .5$. In fact
$v_n(T) \le (T-n)/2n$ for $T \ge n$ because evader spreading is feasible for any $n$, so $K_n \ge .5$ for $n \ge 1$.
Searcher spreading is also feasible here. Against searcher spreading the evader's best strategy
is to simply remain stationary, in which case there is no payoff for the first $2n-1$ time periods.
Therefore $v_n(T) \ge (T-(2n-1))/2n$ for $T \ge 2n-1$, and hence $K_n \le (2n-1)/2n$. Thus

$$.5 \le K_n \le \frac{2n-1}{2n} < 1. \qquad (14)$$
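
The two spreading bounds are simple enough to evaluate directly. The Python sketch below computes them with exact fractions; the function names `value_bounds` and `k_bounds` are assumptions made for illustration.

```python
from fractions import Fraction


def value_bounds(n, T):
    """Bounds on v_n(T) from the spreading arguments, for T >= 2n - 1."""
    lower = Fraction(T - (2 * n - 1), 2 * n)   # searcher spreads, evader waits
    upper = Fraction(T - n, 2 * n)             # evader spreads
    return lower, upper


def k_bounds(n):
    """Interval for K_n implied by v_n(T) = T/2n - K_n and the bounds above."""
    return Fraction(1, 2), Fraction(2 * n - 1, 2 * n)


for n in range(1, 7):
    lo, hi = k_bounds(n)
    print(n, lo, hi, value_bounds(n, 3 * n))
```

For $n = 1$ the two ends of the interval coincide at .5, which agrees with the statement that spreading is optimal for both sides in that case.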


For all $T$, spreading is optimal for both sides when $n = 1$. It also turns out to be optimal for the
evader when $T = 2n$, a game that is of some interest because $2n$ is the smallest value of $T$ such that
the solution is not trivial. To see this, note first that we have already established that $v_n(2n) \le .5$.
The searcher can also guarantee a payoff of .5, but by "rushing" rather than spreading. In rushing,
the searcher essentially charges from one end to the other at top speed, except that for all $t$ such
that $2 \le t \le 2n$ he must be equally likely to occupy cells $t$ and $t-1$; the split is required to
prevent the possibility that the evader might pass by without coincidence. By rushing, the searcher
guarantees that the probability of a coincidence somewhere in the first $2n$ periods is at least .5, so
$v_n(2n) \ge .5$. Therefore $v_n(2n) = .5$, since the opposite inequality has already been established.
Obviously the searcher could continually rush from one end to the other, obtaining a payoff
of .5 for every $2n$ time periods. This is not attractive when $T$ is large, however, since a uniform
distribution will in the long run obtain a payoff of $1/2n$ per time period. The searcher's dilemma
is that rushing and spreading each have their attractions. Unfortunately the two strategies are
incompatible in that rushing retains a concentrated distribution, whereas spreading aims for uniformity.
This dilemma does not exist for the evader, since spreading is optimal for $T = 2n$ and also
attractive in the long run. One might therefore expect that $K_n$ would be closer to .5 than to 1
in (14). This turns out to be the case. Table 1 shows $K_n$ for $1 \le n \le 6$ as established with linear
programming formulations generated by the General Algebraic Modeling System (GAMS) and
solved with MINOS (Modular In-core Nonlinear Optimization System) on the NPS IBM 3033AP
mainframe computer.


  n      K_n      T_n
  1     .5000      2
  2     .5357      6
  3     .5431     10
  4     .5440     13
  5     .5459     14
  6     .5459     19

Table 1. $K_n$ and $T_n$ for $n = 1, \ldots, 6$.


Additionally $T_n$ is listed, which is the first time both probability distributions become uniform. $T_n$
is remarkably close to $3n$, but is not $3n$ exactly. When $n = 3$, player 1 can only force a payoff of
$9/6 - .5433$ at time 9 if uniformity at time 9 is forced.

Figure 2 shows how the searcher's probability distribution $p(\cdot,\cdot)$ evolves with time when there
are $2n = 12$ cells and $T$ is 19 or larger. The first six time periods are not shown because $p(t,t) = 1$ for
time $t \le 6$; the searcher moves forward at top speed as long as contact with the evader is physically
impossible. Equilibrium first appears at time 19. The searcher's motion might reasonably be
characterized as a compromise between rushing and spreading.

Figure 3 shows the evader's probability distribution $q(\cdot,\cdot)$. The highest probability is in cell
12 at time 11, the last time at which the searcher is guaranteed not to be there. That probability
(.306) is evenly divided between cells 11 and 12 at time 12, and then spreads out from there. The
equilibrium distribution first appears at time 18. Note that the probability in low numbered cells
goes through a maximum. This also happens with the searcher ($p(1,17) = .0899 > p(1,19) =
.0833$), but much more weakly.

6.  A  Two-Dimensional  Example

Now consider an 8-time period problem where a searcher and an evader move over a 5 × 5 grid of cells.
The searcher begins in the upper left cell and the evader begins in the lower right cell. The searcher
detects the evader with certainty if they share the same cell. Both players can move between cells
in a single time period if the cells share a side or a corner. This problem has approximately 381,000
pure strategies (i.e., feasible paths) for each player. It can be solved with linear programming but
is large enough to make the Brown-Robinson method attractive, especially if a microcomputer
solution is desired.

The Brown-Robinson procedure for this problem was programmed in Fortran 77 on a Macintosh
IIx computer. After 40,000 fictitious plays, mixed strategies for both sides were generated
which bounded the value of the game between .1845 and .1938. On this microcomputer, approximately
5 fictitious plays per second were accomplished. Figure 4 indicates the rate of convergence
of the bounds.

[Plot: expected payoff (vertical axis, 0.1 to 0.4) against number of game plays (horizontal axis, 0 to 10000).]

Figure 4. Bounds on the Value of the Game Generated by Fictitious Play.

The  same  problem  was  solved  exactly  using  linear  programming.  Required  were  1383  variables 
and  the  same  number  of  constraints.  An  optimal  solution  was  obtained  after  2385  pivots  and  used 
approximately  410  CPU  seconds  on  the  NPS  mainframe.  The  value  of  the  game  is  .1891.  Optimal 
marginal distributions for the searcher and evader (×1000) are shown in Figures 5 and 6.
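
The reported problem size is consistent with the "nodes plus arcs" count given in Section 4. The Python sketch below performs that count for a grid with moves between cells sharing a side or a corner, or no move at all. It assumes that variables are defined for every cell in every period, whether or not the cell is reachable; that is our assumption rather than a statement from the text, and the function name `lp_size` is hypothetical.

```python
def lp_size(rows, cols, T):
    """Nodes and arcs of the T-period network on a rows x cols grid.

    A player may stay put or move to any cell sharing a side or a corner.
    """
    def neighbours(r, c):
        return [(r + dr, c + dc)
                for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                if 0 <= r + dr < rows and 0 <= c + dc < cols]

    cells = [(r, c) for r in range(rows) for c in range(cols)]
    nodes = len(cells) * T                                         # one g(j, t) each
    arcs = sum(len(neighbours(r, c)) for r, c in cells) * (T - 1)  # one u(i, j, t) each
    return nodes, arcs


nodes, arcs = lp_size(5, 5, 8)
print(nodes, arcs, nodes + arcs)   # 200 1183 1383
```

Under this assumption the count grows linearly in $T$: each additional period adds 25 nodes and 169 arcs on the 5 × 5 grid.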

Since any $u(\cdot,\cdot,\cdot)$ and $v(\cdot,\cdot,\cdot)$ will be optimal if they satisfy the path constraints and have
optimal  marginal  distributions,  it  is  reasonable  to  suspect  that  this  problem  might  have  many 
optimal  solutions.  This,  in  fact,  is  the  case.  Even  the  marginals  are  not  unique.  For  example, 
any  marginal  distribution  for  the  evader  at  time  2  is  optimal  if  it  "connects"  optimal  marginals  at 
times  1  and  3.  Figures  5  and  6  show  optimal  solutions  with  diagonal  symmetry,  but  this  symmetry 
was  forced  for  esthetic  reasons  by  adding  additional  constraints. 

In this problem, the equilibrium distribution of .04 in each cell is reached at time 8 for both
players. For the evader, this distribution is a feasible extension of his optimal marginal distribution
at time 6. Were that true at time 6 for the searcher as well, then equilibrium would have been
reached one time period earlier at time 7. Instead the evader concentrates his effort at time 7
in the upper left-hand corner, taking advantage of a low searcher marginal level there. For all
times $t \ge 8$, the equilibrium distribution is optimal for both players, and the value of the game is
$.1891 + .04(t - 8)$.

Figure 5. Searcher's Marginal Distribution (×1000).

Figure 6. Evader's Marginal Distribution (×1000).